Add test coverage as 4th Copeland criterion with anti-gaming safeguards#120

Merged
that-github-user merged 1 commit into main from issue-119-copeland-test-criterion
Mar 29, 2026

@that-github-user

Owner

Summary

  • Split Copeland filesChanged into nonTestFiles (fewer=better) and testFiles (more=better)
  • Anti-gaming: test files only count when agent also changed production code
  • Cap at 3 test files: prevents domination by sheer volume
  • Helper function effectiveTestFileCount() for testability
  • Updated display: Scope + TestCov columns in Copeland table
  • 3 new tests: test-beats-no-test, test-only-no-bonus, cap-prevents-gaming
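The anti-gaming rule and the cap described above can be sketched as follows. This is a minimal illustration rather than the merged code: the PR names the helper `effectiveTestFileCount()` and mentions a `nonTestFiles > 0` check plus a `Math.min` cap, but the parameter list and the constant name `MAX_COUNTED_TEST_FILES` are assumptions.

```typescript
// Hypothetical constant name; the PR only states that the cap is 3.
const MAX_COUNTED_TEST_FILES = 3;

// Assumed signature: both arguments are file counts for one agent's change set.
function effectiveTestFileCount(nonTestFiles: number, testFiles: number): number {
  // Anti-gaming: test files earn no credit unless production code also changed.
  if (nonTestFiles <= 0) {
    return 0;
  }
  // Cap: test files beyond the limit add nothing to the score.
  return Math.min(testFiles, MAX_COUNTED_TEST_FILES);
}
```

Under these assumptions, a test-only change set scores 0 on this criterion, and an agent that adds ten test files scores the same as one that adds three.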

Generated by thinktank Opus — 5 agents, 4/5 pass. All 3 passing agents independently implemented the same anti-gaming logic (nonTestFiles > 0 check + Math.min cap). Strong consensus.

Change type

  • New feature

Related issue

Closes #119

How to test

npm test  # 164 tests pass
thinktank run "task" -n 3  # Copeland table now shows Scope + TestCov columns

Breaking changes

  • CopelandScore type changed: filesChangedWins → nonTestFilesWins + testFilesWins
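For code that consumes the score type, the change might look roughly like the sketch below. Only the field names `filesChangedWins`, `nonTestFilesWins`, and `testFilesWins` come from this PR; the interface name `CopelandScoreAfter`, the helper `fileRelatedWins`, and the omission of all other fields are assumptions for illustration.

```typescript
// Before this PR the type carried a single counter: filesChangedWins.
// After this PR that counter is replaced by two (other fields omitted here).
interface CopelandScoreAfter {
  nonTestFilesWins: number; // pairwise wins on code scope (fewer files is better)
  testFilesWins: number;    // pairwise wins on test coverage (more files is better)
}

// Hypothetical helper: callers that previously read filesChangedWins now have
// to decide how to combine the two counters; summing is one possible choice.
function fileRelatedWins(score: CopelandScoreAfter): number {
  return score.nonTestFilesWins + score.testFilesWins;
}
```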

🤖 Generated with thinktank (Opus)

Split filesChanged into nonTestFiles (fewer=better) and testFiles
(more=better, capped at 3). Anti-gaming: test files only count when
agent also changed production code. Prevents score inflation via
empty test files.

4 criteria now: tests passed, convergence, code scope, test coverage.
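To make the four criteria concrete, here is a hedged sketch of pairwise Copeland-style scoring over them. It is not the thinktank implementation: the `AgentResult` shape, the function names, and the +1/-1 accounting per criterion are assumptions; only the criteria, their directions, the cap of 3, and the production-code requirement come from this PR.

```typescript
// Hypothetical per-agent record; field names are illustrative.
interface AgentResult {
  id: string;
  testsPassed: number;   // higher is better
  convergence: number;   // higher is better
  nonTestFiles: number;  // fewer is better (code scope)
  testFiles: number;     // more is better, capped at 3 (test coverage)
}

// Test files count only when production code changed, capped at 3.
function effectiveTests(r: AgentResult): number {
  return r.nonTestFiles > 0 ? Math.min(r.testFiles, 3) : 0;
}

// Returns +1 if a wins the criterion, -1 if b wins, 0 on a tie.
function compare(a: number, b: number, higherIsBetter: boolean): number {
  if (a === b) return 0;
  return (a > b) === higherIsBetter ? 1 : -1;
}

// Every pair of agents is compared on each criterion; a win adds 1, a loss subtracts 1.
function copelandScores(results: AgentResult[]): Record<string, number> {
  const scores: Record<string, number> = {};
  for (const r of results) scores[r.id] = 0;
  for (let i = 0; i < results.length; i++) {
    for (let j = i + 1; j < results.length; j++) {
      const a = results[i];
      const b = results[j];
      const net =
        compare(a.testsPassed, b.testsPassed, true) +
        compare(a.convergence, b.convergence, true) +
        compare(a.nonTestFiles, b.nonTestFiles, false) +
        compare(effectiveTests(a), effectiveTests(b), true);
      scores[a.id] += net;
      scores[b.id] -= net;
    }
  }
  return scores;
}
```

With this accounting, two agents that are otherwise tied are separated by the test coverage criterion only when one of them changed production code and added at least one test file.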

Generated by thinktank Opus (5 agents, 4 pass, all 3 passing had
identical anti-gaming logic — strong consensus on the approach).

Closes #119

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

that-github-user merged commit bdbed66 into main on Mar 29, 2026
4 checks passed
that-github-user deleted the issue-119-copeland-test-criterion branch on March 29, 2026 01:06

Development

Successfully merging this pull request may close these issues.

Add test coverage as 4th Copeland scoring criterion
